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Abstract 

The problem of reconstruction of digital images from their degraded measurements is regarded as a problem of 
central importance in various fields of engineering and imaging sciences. In such cases, the degradation is typically 
caused by the resolution limitations of an imaging device in use and/or by the destructive influence of measurement 
noise. Specifically, when the noise obeys a Poisson probability law, standard approaches to the problem of image 
reconstruction are based on using fixed-point algorithms which follow the methodology first proposed by Richardson 
and Lucy. The practice of using these methods, however, shows that their convergence properties tend to deteriorate 
at relatively high noise levels. Accordingly, in the present paper, a novel method for de-noising and/or de-blurring of 
digital images corrupted by Poisson noise is introduced. The proposed method is derived under the assumption that 
the image of interest can be sparsely represented in the domain of a linear transform. Consequently, a shrinkage-based 
iterative procedure is proposed, which guarantees the solution to converge to the global maximizer of an associated 
maximum-a-posteriori criterion. It is shown in a series of computer-simulated experiments that the proposed method 
outperforms a number of existing alternatives in terms of stability, precision, and computational efficiency. 
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I. Introduction 

The image formation model of many key imaging modalities rely on the notion of event counts. The latter, for 
example, quantifies the number of gamma photons which pass though a single slit of the collimator of a gamma 
camera in positron emission tomography (PET) and single positron emission computer tomography (SPECT) 1 1 1- 
(3). In addition, statistical models of the same type are routinely used in optics to account for the process of 
"counting" the number of optical photons registered by a sensor of a (CCD) camera [4|, |5 1. Confocal microscopy [6|, 
astronomical imaging |7}, and turbulent imaging [8| are among other fields of imaging sciences where the notion 
of event counts is standardly used. 

In the most general setup, the image formation model of the above-mentioned imaging modalities assumes that a 
data image is formed as a blurred version of the original (i.e. true) image contaminated by Poisson noise. Specifically, 
if <?(x) e Z + (with x G fl C Z 2 ) denotes the observed Poisson counts image and / (x) <E R + is its corresponding 
original counterpart, then the image formation model can be formally expressed as given by |9j Section 7.3] 

g = V{H[f}}, (1) 

where V stands for the operation of contamination of H[/] by Poisson noise and H denotes the operator of 
convolution with a (known) point spread function (PSF) h, where h is assumed to be positive and mean preserving. 
Accordingly, to recover the original image / from g, the combined effect of "P{H[-]} has to be inverted. 

The model of ([T]l has long been in use in a variety of different applications. For example, in |6 | the model describes 
the formation of microscopic images, which are subsequently enhanced by means of a maximum-likelihood (ML) 
procedure, also known as the Richardson-Lucy algorithm fT0| , fTT) . The same model and the same reconstruction 
approach has been used in the field of nuclear imaging, where they have been employed for recovery of PET and 



SPECT scans [12|, |T3| . It was argued in [14], however, that the ML approach may lead to either unstable or 
noisy results in the case of poorly conditioned operators H. To alleviate this problem, it was proposed in Jl4"[ 
to replace the ML approach by maximum-a-posteriori (MAP) estimation, which provides means to regularize the 
inversion of H via incorporating some prior information about / in the process of its reconstruction. Specifically, 
[14 1 suggested using either the Tikhonov -Miller or bounded variation models for /. Unfortunately, as will be shown 
in the experimental part of the present paper, the above algorithms cannot fully alleviate the instability problem 
intrinsic in the inversion of (QJ. To overcome this deficiency, the work in p3| suggested a different approach also 
based on the bounded variation model which leads to a minimum total-variation (TV) solution for the true image. 
In this case, to find a solution to ([T]i, a variable splitting procedure of | |16| was employed. Despite a substantial 



improvement in the stability of reconstruction, the computational cost of 1 15 1 is relatively high, which undermines 
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the applicability of this method for processing large data sets. Moreover, the algorithm of |15| may produce non- 
positives solutions, which are not natural in the case of Poisson imaging. To overcome the computational inefficiency 
of p7j , a modification of the variable splitting method was proposed in |17| . Unfortunately, this modification seems 
to "restore" the instability concerns. 

A different class of reconstruction methods has been recently proposed based on the assumption of compressibility 
of the true image /. This assumption suggests that / can be sparsely represented either in the spatial domain or 
in the domain of a certain linear transform. Thus, for example, [18 J analyzes three different penalty functions, 
whose role is to impose the sparsity constraint on the estimated images. Although the proposed algorithms have 
exceptional stability and computation properties, they require the original image to be sparse in the spatial domain 
- the assumption which does not seem to be natural for a large spectrum of practical images. In this regard, a more 
general assumption has been used in [19], where the true images are supposed to be sparsely represented in the 
domain of a tight frame. This method, however, is based on "gaussianization" of the Poisson noise by means of a 
variance stabilization transform (VST) [20| that is known to perform inadequately in low-count settings. Another 
work that employed the assumption on / to be sparsely representable in the domain of an orthonormal transform 
is pT) , where the sparsity-constraned Poisson inverse problem is solved in an approximate way by minimizing a 
sequence of I2 — £1 objective functions. However, as acknowledged by the authors of pT) , their algorithm has a 
drawback of slow convergence, which should be addressed in future research. As well, one should also mention the 
method of [22|, in which an estimate of the true image / is obtained based on maximization of an ML criterion 



penalized by the ^o- norm of the platelet coefficients of the estimate. It is not clear, however, if the method of |22| 
could be used for a different type of image representations (rather than platelets), let alone the necessity to augment 
the performance of J22) by the computationally expensive cycle-spinning procedure of (23) to optimize the quality 
of resulting estimates. 

In the case when the effect of H can be neglected, a different family of reconstruction methods has been suggested 
based on the use of VST |20) . In particular, in such methods, the data image g is first preprocessed by a VST that 
transforms the Poisson noise into approximately homogeneous and Gaussian p4)-p7). Subsequently, the true image 
is recovered from the "gaussianized" data using an appropriate statistical framework. Among other reconstruction 
techniques for solving ([T]) with H«I are wavelet Wiener filtering fl28") , hypothesis testing |29| , as well as empirical 
Bayesian approaches p0| , |3T) . 

In the current paper, a different method for reconstruction of the true image / in ([T]i is introduced. As opposed to 
the algorithms described above, the proposed method is exceptional for it concurrently fulfills a number of essential 
objectives, viz. 

1) Exactness: No auxiliary transformations and/or approximations are applied to the data image g to modify the 
properties of measurement noise, and therefore the method is applied under realistic statistical assumptions 
regarding the noise nature. 

2) Generality: The image formation model employed by the proposed method is designed to accommodate a 
number of reconstruction scenarios, namely de-noising, de-blurring, and a combination thereof. As a result, the 
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very same procedure can be applied to recover an image of interest for a spectrum of different degradations. 
3) Versatility: The proposed reconstruction is carried out under the assumption that the original image / can 
be sparsely represented in the domain of a linear transform. This assumption is currently recognized to be 



superior to many alternative models, including the Tikhonov -Miller and total variation models [32 1. 

4) Efficiency: The proposed solution is based on the concept of iterative shrinkage p2)-p4[ as a modern, 
computationally efficient, and stable procedure for solving a variety of inverse problems. In this way, the 
present contribution extends the theory of iterative shrinkage (aka thresholding) to the case of problems 
concerned with Poisson noises. 

5) Uniqueness: Under a certain setup, the proposed algorithm is guaranteed to converge to the global maximizer 
of a maximum-a-posteriori criterion, thereby providing a unique solution to the reconstruction problem at 
hand. 

Moreover, in the experimental part of the paper, it is shown that the proposed reconstruction method outperforms 



a number of alternative algorithms in terms of normalized mean-square error (NMSE), SSIM quality index 1 35 1, as 
well as its stability and computational efficiency. 

The rest of the paper is organized as follows. Section II describes model assumptions and derives an optimization 
framework to be used in the following sections. Section III details a bound optimization approach, which results 
in an iterative shrinkage procedure. Section IV provides a proof of convergence of the proposed algorithm, while 
Section V presents a number of in silico reconstruction results. Section VI finalizes the paper with a discussion 
and conclusions. 

II. Estimation Framework 

A. Image modeling 

The proposed approach for the reconstruction of / in ([lj is based on the assumption that / can be sparsely 
represented in the domain of a certain linear transform. Specifically, let {ifik}kex be a frame in the space of the 
original image / (36). (Note that I above denotes a set of frame indices, whose definition depends on a specific 
choice of tp k .) Using the frame, / can be represented according to 

/ = *(c) i/^^Tcfc^, (2) 

hex 

where c £ ^(Z) denotes the representation coefficients of / in spa,n{(pk}kex, such that the number of significant Cfc 
is much smaller than #Z, i.e. the total number of <pk- It is worthwhile noting that the model of Q is standard in the 
theory of sparse representation which has firmly reserved a leading position among the modern tools of signal and 
image processing. The value of sparse representations has been demonstrated in numerous fields and applications, 



which include signal modeling p7| , compressed sensing f38) , |39|, independent component analysis and blind 
source separation |40| , pT[ , inverse problems [33 1, ]42) , signal and image de-noising |43j, [44 1, morphological 
component analysis [45] as well as its earlier version in [46 1, just to name a few. 
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When expressed as a function of the representation coefficients c, the image formation model ([T]l becomes 

g = V{U[^(c)}}=V{A[c}}, (3) 

where A = H • <I> is a composition map from £2 (I) to the signal space, which represents the combined effect of 
image synthesis and blur. In the discussion below, we will be mainly interested in the following three settings: 

• Setting 1: The blur is negligible and hence A = <&. In this case, the problem becomes that of image de-noising. 

• Setting 2: The basis €> is the canonical (Dirac) basis (i.e. 4> = I) and hence A = H. In this case, the image 
/(x) is identified with its representation coefficients c, and the problem becomes that of sparse deconvolution. 

• Setting 3: Neither H nor 3? can be simplified/neglected. In this case, the estimation problem at hand becomes 
that of sparse reconstruction. 

For the sake of generality, in what follows the specific nature of A will not be specified until the experimental part 
of the paper, where different reconstruction examples are presented. In particular, A is considered to be a linear 
map from the discrete domain of the representation coefficients c £ £2(1) to the space of blurred images. 

It is well known that in the case of a poorly conditioned operator A, the problem of recovering c E £2 (2) from 
noisy measurements of A[c] can be highly unstable (in the sense that there will not be a continuous dependency 
between the data and an estimate of c). In order to overcome this deficiency, we employ the framework of MAP 
estimation which provides the most likely solution, given the observed data and a reasonable assumption regarding 
the statistical nature of the true coefficients c fl47) . The details on this approach are presented next. 

B. MAP estimation 

Since most of the modern imaging devices work with finite data, it seems to be appropriate to restrict the subset 
51 C Z 2 to be a finite lattice. Accordingly, the measurement g and the original image / are assumed to be of size 
N x M and K x L, respectively. In this case, the range of A becomes a set of bounded N X M matrices. 

To compensate for the information loss inflicted by the poor conditioning of A, the MAP estimator takes advantage 
of a priori knowledge regarding the quantity of interest, i.e. c. In the case at hand, such knowledge is specified in 
the form of a prior probability distribution P(c), which leads to the following definition of the MAP estimator 

cmap = argmax P(c | g) = argmax P(g\c) P{c) , (4) 

c c 

where P(g \ c) represents the likelihood function pertaining to the observation model Q. In the case of Poisson 

noises, the likelihood function of a single measurement <7(xj) at pixel Xj is given by 

e -(A[c]W A r ns(x*) 

10 = ^r^, (5) 

where (A[c])j denotes the x^-th coordinate of A[c] and g(xi) € Z + is interpreted as the "number of counts" (e.g. 
the number of gamma photons registered by a gamma camera in nuclear imaging). Assuming that the values of 
g(xi) are independent and identically distributed (i.i.d.), the joint probability of the observed image g is given by 

NM e -(A[c]W A r c n»(xO 
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To complete the model, the coefficients of c = {c^kex are assumed to be identically distributed according to a 
Generalized Gaussian (GG) probability law |48), so that the joint probability of c can be defined as 

P(c) = TT V - e -(M/« P (7) 

While (3 > controls the variance of Ck, the value of < p < oo determines the appearance of c in terms its 
sparsity (with smaller values of p resulting in "heavier" tails of the corresponding probability density function 
(pdf), thereby allowing larger values of Ck to be occasionally drawn). In particular, the choice of p = 1 results in a 
Laplacian distribution, which is commonly used to describe the behavior of sparse representation coefficients [37], 
[03). Subsequently, the final expression for the MAP estimate becomes 



cmap = argmax 



It is conventional to convert the maximization problem ([8]) into a minimization problem through applying the 
log-transform to the posterior probability in ([S]), followed by inverting the sign of the expression thus obtained. In 
this case, the MAP estimate can be re-expressed as given by 

cmap = argmin{.E(c)} , 

C 

B(c) = <l,A[c])-07,log(A[c]))+ 7 ||c||» 7 = V/? p , (9) 

where 1 stands for an N x M matrix of ones, ( • , • ) stands for the standard inner product in R N * M and ||c||g = 
|cfc| p is the i'p-norm of the representation coefficients. 

fcez 

It should be noted that the likelihood model of ^ interprets the true image value /(xj) as the mean value of 
the corresponding random observation #(xi). Moreover, since in the case of the Poisson distribution, the first and 
the second moments of the distribution are equal, the values /(Xj) should be assumed to be nonnegative. The latter 
assumption restricts the domain of E(c) in ^ to be now defined as 

dom E = {c e £ 2 (Z) \ ®[c] h 0} . (10) 

Note that E(c) is convex over dom E. Moreover, it is strictly convex over dom E if p > 1 or g y and A 
is non-degenerate (albeit, possibly, ill-conditioned). In the latter case, the functional E(c) in ^ admits a unique 



minimizer (note that the set dom E is convex) |49 Ch. 11], which can be found by any algorithm which is 
guaranteed to converge to a stationary point of E(c). On the other hand, when either g is not strictly positive or 
A possesses a non-trivial null-space, the convexity of E(c) is not strict. In such a situation, the uniqueness of a 
minimizer of E(c) for p = 1 (which is the case of primary concern in this paper) can be re-established by replacing 
the £i-norm in ^ by its smooth relaxation, e.g., by (1, \J c? + e), for some e -C 1 [42]. It should be noted, however, 
that as long as practical aspect of the image reconstruction is concerned, it has been observed that, for the case of 
the method proposed in this paper, the above relaxation seems to be unnecessary even in the case of degenerate H 
and g > 0. 
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It should be noted that the problem of minimizing E(c) over the set defined by ( 10 1 does not necessarily have 
to be formulated as a constrained minimization problem. This is because, for nonnegative data images, the term 
— {g, log(A[c])) in (|9]) works similar to a log-barrier function which forces the solution to stay within the feasible 
region defined by the condition <&[c] y 0. In fact, strong theoretical guarantees for the existence and attainability 



of a unique minimizer of E(c) follow directly from the theory of interior-point methods |49 Ch.ll]. 

A possible solution to computing cmap could be by means of standard optimization techniques, such as gradient- 
based methods [50]. It was recently argued, however, that such general purpose tools might be ineffective when 
applied to the problem at hand, since they do not exploit properly the sparse structure of the desired solution |32) . 
In the following section, a different approach to finding cmap is detailed based on the methodology of iterative 



shrinkage [32]-[34], [42]. 



III. Bound optimization and iterative shrinkage 

The non-differentiability of the £ p -norm for the case p = 1 (which forms the main focus of this paper) makes it 
impossible to solve (|9]l by means of gradient-based optimization methods. Fortunately, an effective solution to the 
problem can be derived using the majorization-minimization (MM) approach of [51 1 (also known as the method 
of bound optimization |52j). The first step in applying this method consists in replacing the original problem of 
minimizing E(c) by a problem of minimizing a surrogate functional Q(c, cq) (with cq € t-x^C) being an arbitrary 
set of coefficients). Specifically, let the new functional Q(c, Co) have the following form 

Q(c,Co)=£(c) + *(c,c ), (11) 

where 

*(c, Co ) = ( g ,log(A[c]/A[c ])) - (A* [g/A[c }},c-c ) e2(x) + ^\\c-c \\l (12) 

with the "slash" to be interpreted as an element-wise division, A* being the adjoint of A, and (■, -}g 2 (i) standing 



for the inner product in ^(X)- It goes without saying that the surrogate functional in ( 12 1 has to obey a number of 
critical constraints which will be detailed in Section |IV] 

To minimize Q(c, cq) with respect to c, its gradient is computed first according to 

VQ(c,c ) = A*[l] - A* [g/A[c]] + P1 |c| p - 1 sign( C )+ (13) 
+A* [g/A[c]} - A* [g/A[c ]} + M (c - cq). 
Subsequently, canceling the similar terms with opposite signs and equating the gradient to zero yields 

c + P2\ c \P-hign(c) = c + - A* [g/A[c ] - 1] . (14) 
It should be noted that the above analysis is relevant to the differentiable case of p > 1. For the case of p = 1, the 



first order optimality condition ( 14 1 has to be replaced by its more general version requiring 



c+Zd\\c\\ 1 3c + -A*\g/A[c ]-l], (15) 

[A [L 
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where 9||c||i denotes the subdifferential of ||c|| i. Remarkably, both ( 14 1 and ( 15 i have the same closed form solution. 



Particularly, given the inverse 5 Pi7)/1 (c) of the function on the left-hand side of (14i, the optimal c that minimizes 
Q(c, c ) is given by 



CO 



1 



.9 - A[c ] 
A[ Co ] 



(16) 



The functional in ( |T6] > is analogous to the functional used in the existing iterative shrinkage methods, which have 



been derived under the assumption of Gaussian noises |32|-|34|, |42|. Just like in the latter studies, we propose to 
find a minimizer of the original functional in Q iteratively, viz. 



1 A* 

-A* 



A[q] 



(17) 



where t stands for the iteration index. The convergence of the above iteration scheme to a global minimizer of ([9]) 
is proven in the next section. Before turning to the proof, we note that setting p = 1 seems to be a reasonable 
choice in practice, since it guarantees the convexity of E{c) on one hand, and leads to a sparse solution on the 
other. It is worthwhile noting that, in this case, the function <S Pj7jA1 becomes the notorious soft thresholding rule 
(43), which is defined as 

(|c| - 7/M) sign(c), \l\c\>j/n 
0, otherwise. 
Finally, it is worthwhile noting that the computational cost in ( fT7] > is mainly determined by the cost of applying 
the composition transform A = H • $ and its adjoint. While the convolution operator can be computed efficiently, 
e.g., by means of the fast Fourier transform [53 1, the application of 3? and 4>* depends on the type of transformation 



Si, 7 ^(c) = 



(18) 



in use. Fortunately, most of the relevant transforms (such as wavelet [54], ridgelet [55], and curvelet (39) transforms) 
admit computationally efficient implementations, which are typically of a logarithmic complexity at most. A formal 
comparison between the computational efficiencies of the proposed and reference methods is given in Section V. 



IV. Convergence analysis 

To demonstrate that the proposed algorithm constitutes a viable alternative to traditional approaches, its conver- 
gence properties need to be analyzed next. As was shown in (51) , the convergence can be guaranteed provided that 
the surrogate functional Q(c,Ct) satisfies the following conditions: 



1) Q(c,c t ) > E(c), Vc S dom£ 

2) Q(c t ,c t ) = E(c t ) 



(19) 



with c t being an arbitrary but fixed reference point in domfi. 

Before verifying the above conditions, it is instructive to show first that minimizing Q(c,c t ) with respect to c 
yields a reduction in the value of the original functional E(c). To this end, let ct+i denote a minimizer of Q(c,Ct), 



i.e. ct+i — argmin c Q(c, ct). For such a c t +\ and Q(c,Ct) obeying (19 1, one then obtains 

E{c t+1 ) = Q(c t+1 ,c t ) + E(c t+1 ) - Q(c t +i,c t ) < Q(c t ,c t ) + E{c t ) - Q{c t ,c t ) = E(c t ), (20) 
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resulting in 

E{c t+1 ) < E(ct). (21) 



Thus, the bound-optimization algorithm ( 17 1 guarantees a reduction in the value of E(c t ) at each iteration provided 



Q(c,c t ) obeys the conditions in (19 1. It should be noted, however, that for the value of E{c t+ i) to decrease, it is 
sufficient to require that Q(c,c t ) > E(c) only at c = c t +\. The latter seems to be much less restrictive condition 
as compared to Condition 1 in ( [T9| ). (We will take advantage of this fact later in this section.) 



The fact that Condition 2 in (19i is satisfied by the proposed surrogate functional Q(c, Ct) can be verified by 
direct substitutions. Moreover, it is important to point out that the function ^> in ( fTT) obeys 

*(ct,ct)=0 and W(c,c 4 )| c=c( = 0, (22) 

which suggests that ^(c, c t ) has an extremum at c = c t that is equal to zero. Consequently, if ^(c, c t ) is a convex 
function, Condition 1 will be automatically fulfilled. The convexity, on the other hand, can be determined from the 
properties of the Hessian operator corresponding to <3>(c, c t ), which can be shown to be equal to 

V 2 *(c) = // 1 — A* diag [ 9 ) A. (23) 



v(A[ C ]) 2 7 

Note that, in the expression above, I stands for the identity operator, while the diagonal operator diag (g/(A[c]) 
is defined in the standard way as 



dia H(Awr ] "(AW^ (24) 

for any arbitrary y £ M. NxM . Provided the elements of g/(A[c]) 2 are bounded and A is compact, the Hessian in 



(23 i is necessarily a self-adjoint and compact operator. Moreover, the minimial eigenvalue \ m i n of the Hessian can 
be shown to be bounded by 

m ax 

(A* A), (25) 

where || ■ stands for the supremum norm and A max (A*A) denotes the maximum eigenvalue of A* A. Thus, as 
long as 

H> || 5 /(A[c]) 2 || oo A ma:c (A*A), (26) 



the Hessian V W(c) is positive definite, in which case ^(c, Ct) is convex, and hence Condition 1 of ( 19 1 is satisfied. 
It should be pointed that the finiteness of fi is always guaranteed by the fact that c G dom£. The value of fi 



in (26 1, however, is defined as a function of c, and therefore it would be quite problematic (if possible at all) to 



determine fx from (26i, had we decided to do so. In this sense, the condition (26 1 should be regarded as merely a 
"proof of existence". 

It turns out that in practical cases there are much simpler means to determine a value of a that guarantees a 



reduction in the value of E{c) with respect to E(c t ). First, we note that according to (20 1, it is sufficient to find a 
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that results ir[[] 



yllct+i-CtHl > (A* [,9/A[c t ]] ,c t+1 -c t ), 2(I) - (g, log (A[c t+1 ]/A[c t ])), (27) 
thereby guaranteeing that ^(ct+i, Ct) > 0, and therefore E(c t +i) < E(c t ). This implies the following practical 



way to find an acceptable fi t . Let the right-hand side of (27i be denoted by F(c t +i, c t ), which is a computable 
quantity provided the values of c t and Ct+i- Consequently, a suitable value of fi t can be found using the following 
algorithm. 



Algorithm 1 Finding a suitable scaling parameter p, t 

l: Preset: v = 1, < a < 1 

2: Compute: c t+ i = Si, 7 ,i (c t + A* [g/A[c t ] - 1]) 

3: if ||c t - Ct+ilH > 2F(c t+1 ,c t ) then 

4: while f \\c t - ct+ilH > 2F(ct+i,ct) do 

5: f <^ai/ 

6: ct+i = 5i l7 ,„ (ct + ^A* [g/A[ct] - 1]) 
7: end while 
8: fl t = v/a 

9: else 

10: while v\\c t -ct+ilH < 2i^(c f+ i,c t ) do 

11: V <f= vjct 

12: ct+l = Si l7 ,„ (ct + ^A* [ff/A[ct] - 1]) 

13: end while 

14: Mi = ^ 

15: end if 

16: return /itt 



Algorithm 1 has been designed to find a minimal possible fi (with the accuracy of log a in the logarithmic scale) 
that guarantees thaQ E(c t+ i) < E(c t ). It should be emphasized that the proof of existence in (26 1 assures that an 



acceptable value of fi t can always be found. The form of the shrinkage operator in ( fT7) i suggests that smaller values 
of fit result in more substantial shrinkage, which in turn leads to more sizable changes in c t +\ with respect to Ct. 
The idea of maximizing the effect of shrinkage through minimizing the value of fi t lies in the heart of the method 



proposed in |56|. In the present case, the minimality of fi t is guaranteed by the design of Algorithm 1, whose only 



downside is in the extra calculations required. In practical scenarios, however, when one is working with a pool 

'The subscript t is added intentionally in fit to indicate that its value can be iteration-dependent. 

2 The accuracy can be improved via choosing f3 to be close to 1, which has a drawback of slowing down the convergence. 
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of images of similar nature, Algorithm 1 can be excluded by predefining the value of /i. This can be done, for 
example, by applying Algorithm 1 to several images and determining a fixed value of /i that satisfies (|27|. 



V. Results 

A. Reference methods 

In the experimental part of this paper, the proposed method is referred to as Poisson Iterative Shrinkage (PIS). 
As was argued earlier, PIS is conceptually parallel to the iterative shrinkage method developed in [33) , J42) under 
the assumption of Gaussian noises, in which case the iterative shrinkage (referred to below as Gaussian Iterative 
Shrinkage (GIS)) has the form of 

ct+i = 5 P)7(/1 (ct + ^A* [g - A[c t ]]j (28) 

with 7 = cr 2 /(3 where ft is the bandwidth parameter of the GG pdf in ^ and a 1 is the variance of the i.i.d. Gaussian 
noise contaminating the (blurred) measurements of the true image. The GIS algorithm is known to converge to the 
global minimizer of a MAP criterion provided /i > ||A*A|| J33~) . 

Another reference method used in the comparison is the Richardson-Lucy (RL) algorithm, which is represented 
by the following iterative scheme 



ft+i = ft H* 



<J 



(29) 



MM 

Note that, as opposed to the GIS and PIS methods, the iterations in RL are performed on an estimated image f t 



rather on the coefficients of its representation in a basis/frame. The iteration procedure in (29 1 is derived from a 
maximum-likelihood (ML) model under the assumption of Poisson noises. ML estimators, however, are known to 
result in degraded performance in the case of poorly conditioned H. To alleviate this deficiency, fl4| proposed to 



use the MAP framework to regularize the convergence in (29 1. This method assumes the true / to be in the space 
of bounded-variation images and, thus, the resulting iterative scheme minimizes the total variation (TV) norm of 
the estimate of /. Using the methodology of RL, the above minimization can be performed via 



/t+i i-'Awh)** [mft 



(30) 

where 7 is a regularization parameter, which is to obey 1 — 7 div ( n^^p ) >- to guarantee the reconstruction to 



be non-negative. Below, the algorithm of (30 1 is referred to as the Richardson-Lucy total variation (RLTV) method. 

A different reference method that also takes advantage the TV regularization is (T7J. This method is based 
on minimizing the same objective function as RTLV, subject to a non-negativity constraint on /. The solution in 
p7| uses the variable splitting technique of [ 1 6 1 , which allows reducing the minimization problem to a few simpler 
subproblems. In what follows, the method is referred to as PIDSplit+ (which stands for Poisson image deconvolution 
by variable splitting with a positivity constraint). 

Another reference approach used in the present study is that of [19|. Similarly to the method proposed in the 
present paper, [ 19] assumes / to be sparsely representable in the domain of a linear transform. Subsequently, the 
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algorithm is initialized by applying a VST (namely, the Anscombe transform) to the data image g, followed by 
recovering / as a solution to a standard £2 — t\ minimization problem. In the present study, the above method has 
been implemented using a publicly available code (see |http : / /www . greyc . ensicaen . f r/~f dupe/j j). Due 
to the fact that [19] incorporates a VST with the theory sparse representations, in the discussion that follows, this 
method is referred to as VSTSR. 

The last reference method used in our comparative study is the one described in pT| . The method employes the 
sparsity assumption in the domain of an orthogonal wavelet transform. Subsequently, [21] solves the minimization 
problem of |9]) by minimizing a sequence of quadratic approximations to a log penalty function. Following pT) , 
the above method is referred to as sparse Poisson intensity reconstruction algorithm (SPIRAL). 

Since RL, RLTV and PIDSplit+ tend to become unstable in the case of poorly conditioned H, a common practice 
is to terminate their execution after a predefined number of iterations. In our experiments, their termination was 
performed at the point where the normalized mean-squared error (NMSE) (defined below) reached its minimum 
value. Needless to say that such termination is only possible under the conditions of controlled simulation studies, 
when the original images are known. In practical scenarios, however, the NMSE-optimal termination is generally 
impossible, which suggests that real-life reconstructions obtained with RL, RLTV and PIDSplit+ may be actually 
worse than the reconstructions demonstrated in the present paper. The proposed PIS algorithm, on the other hand, 
remains stable in the course of its convergence, which makes it possible to terminate the algorithm simply after a 
relative change in the value of E{c) drops below a predefined threshold (e.g., 10 -6 ). 

It is worthwhile noting that the reference methods above have been derived using different statistical approaches 
and assumptions. The motivation behind choosing these methods has been to compare the proposed method to the 
approaches based on the Poisson model (i.e. RL, RLTV, PIDSplit+, VSTSR, SPIRAL) as well as to those exploiting 
the idea of iterative shrinkage (i.e. GIS). The main properties of all the reconstruction methods under consideration 
are summarized in Table U 

TABLE I 

Properties of the reconstruction methods under comparison 





Statistical framework 


Prior model 


Noise model 


RL 


ML 


N/A 


Poisson 


RLTV 


MAP 


BV image / 


Poisson 


PIDSplit+ 


MAP 


BV image / 


Poisson 


VSTSR 


MAP 


GG coefficients c 


Poisson 


SPIRAL 


MAP 


Laplacian coefficients c 


Poisson 


GIS 


MAP 


GG coefficients c 


Gaussian 


PIS 


MAP 


GG coefficients c 


Poisson 
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TABLE II 

Computational complexity of the reconstruction methods under comparison 





Computational operations per iteration 


PIS 


2C(n) + 2R(n) + 0(n) 


GIS 


2C{n) + 2R(n) + 0(n) 


RL 


2C{n) + 0(n) 


RLTV 


2C{n) + 0(n) 


PIDSpIit+ 


4C(n) + 0(n) 


VSTSR 


2C{n) + 0{R(n)) 


SPIRAL 


2C(n) + 0(n) 



B. Computational complexity 

The most computationally demanding operations used in the proposed and reference methods are associated with 
the convolution H and frame €> operators. The convolution operator H and its adjoint can be efficiently implemented 
by using, e.g., the FFT algorithm, which requires n\ogn MAC operations, with n = NM representing the total 
number of pixels. Since the implementation of convolution and its related complexity depend on H itself, let C(n) 
represent the total number of MAC operations required by computing H and H*. In a similar manner, let R(n) 
represent the total number of MAC operations required by applying the frame operator 3? and its adjoint. (Thus, 
for example, R(n) ~ 0(n) in the case of 3?* being an orthogonal wavelet transform). Consistent with the notations 
above, Table [IT] summarizes the computational complexities required by one iteration of the proposed and reference 
methods. Needless to say, the overall complexity of the above methods will depend on the number of iterations 
required till their convergence. These numbers will be provided below for specific examples of image reconstruction. 



C. Comparison measures 

The algorithms listed in Table [I] have been compared in terms of the NMSE defined as follows. Let / be a true 
image and / be an estimate of /. Then, the NMSE can be defined as 



11/ -/I 



2 



NMSE = nTnT|' <31) 

with || • \\p being the Frobenius matrix norm, and £ being the operator of expectation. In the current study, the 
latter is approximated by sample mean based on the results of 200 independent trials. 

It has been recently argued that the NMSE may not be an optimal comparison measure as long as human visual 
perception is concerned. For this reason, the NMSE-based comparison has been complimented by comparing the 



reconstruction algorithms in terms of the SSIM index as suggested in [35 1. 
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D. Sparse deconvolution 

Our first example is concerned with the problem of sparse deconvolutiorj^J in which case A = H. In this task, 
our main intension is to demonstrate the importance of regularization and correct modeling of noise for successful 
reconstruction of / in ([TJ. Consequently, in this subsection, the performance of the PIS algorithm is compared to 
those of RL and GIS. 

The assumption of sparsity suggests that / consists of a small number of bright sources scattered over a black 
background. An example of such an image is shown in the upper-right subplot of Fig.[T] where the non-zero samples 
of / have been generated by taking the absolute value of an i.i.d. Gaussian random variables. 




Fig. 1. (First row of subplots) Original image, blurred image, and noisy image (SNR=17); (Second row of subplots) RL reconstruction, GIS 
reconstruction, and PIS reconstruction; (Third row of subplots) Zoomed segments of the original and reconstructed images as indicated by the 
dashed rectangles. 

The blurring artifact was simulated by convolving the test images with an isotropic Gaussian kernel h(x) whose 
-3 dB cut-off frequency was set to be equal to 0.2n. As the next step, the resulting images were contaminated by 

3 In this case, the image / is identified with its coefficients c, i.e. / = c. 
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Fig. 2. (First row of subplots) Original image, blurred image, and noisy image (SNR=5.1); (Second row of subplots) RL reconstruction, GIS 
reconstruction, and PIS reconstruction; (Third row of subplots) Zoomed segments of the original and reconstructed images as indicated by the 
dashed rectangles. 



Poisson noise. In this regard, it should be noted that the level of Poisson noise is defined by the corresponding value 
of the blurred image H[/]. Since real-life images are always contaminated by background noises, the minimum 
value of H[/] should be strictly positive. In this study, a set of three different minimum (background) values, 
namely 15, 30, and 50, were used. Note that since the variance of Poisson noise is equal to its mean value, higher 
background values will result in more severe noises. In this case, it seems to be reasonable to define the SNR as 
a ratio of the maximum value of H[/] (i.e. 255) to its background value. According to this definition, the SNR 
values used in the present study were 17, 8.5, and 5.1. Two examples of simulated data images for SNR=17 and 
SNR=5.1 are shown in the upper-left subplots of Fig. [T] and Fig. |2j respectively. 

In the case of low-pass blurs, the only possibility for H[/] to generate a constant background is to require /(x) ^ 
for all x € fi. This obviously contradicts the assumption on / to be sparse. To alleviate this deficiency, we suggest 
to modify the image formation model via replacing H and / by H = [H 1] and / = [/ fo] T , respectively, where 
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/ e H KxL is assumed to be sparse, and / G M + \{0} is a positive scalar defining the background level. In this 
case, the image formation model of ([TJ can be redefined as 

g = v{n[f]]=V{U[f] + f }. (32) 

Consequently, the reconstruction is applied with H to recover /, in which case the true image is considered to be 
equal to the sum / + f . 
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Fig. 3. (Upper row of subplots) The NMSE of GIS reconstruction as a function of the number of iterations for SNR=17, 8.5 and 5.1; (Middle 
row of subplots) The NMSE of RL reconstruction as a function of the number of iterations for SNR=17, 8.5 and 5.1; (Lower row of subplots) 
The NMSE of PIS reconstruction as a function of the number of iterations for SNR=17, 8.5 and 5.1. 



The above model adjustment was applied only for the cases of GIS and PIS reconstruction, which are based on the 
sparsity assumption. The regularization parameter 7 in ^ and (28 1 was defined to be 1//3 and a 2 //3, respectively, 
with f3 — 4.5 and a 2 equal to the sample variance of the background noise. In the case of GIS, the parameter /1 
was set to be equal to 1.1 • ||H*H||, while in the case of PIS it was chosen using Algorithm 1 with a = 0.8. 

Typical reconstruction results are demonstrated in Fig. [T] and Fig.|2]for SNR=17 and SNR=5.1, correspondingly. 
In particular, the middle row of subplots of the figures show the reconstructions obtained by the (from left to right) 
RL, GIS, and PIS algorithms. For the convenience of the reader, the bottom row of subplots in Fig. [T] and Fig. [2] 
show zoomed fragments of the original and recovered images as indicated by the dashed boxes and letters A, B, 
C, and D. One can see that PIS outperforms all the reference methods in terms of the resolution improvement and 
noise reduction. 

A quantitative comparison of the reconstruction algorithms is presented in Fig. [3] which shows the NMSE as a 
function of the number of iterations for (from up to down) GIS, RL, and PIS, and for different values of SNR, 
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namely (from left to right) 17, 8.5, and 5.1. It should be noted that each value of the NMSE in Fig. [3]is a result of 
averaging the errors obtained in a series of independent trials, where both the true images and noises were drawn 
randomly. 

Observing Fig. [3] one can see that PIS results in considerably lower values of the NMSE as compared to the 
reference methods. As well, it converges to a steady-state solution after a much smaller number of iterations as 
compared to the GIS algorithm (i.e. 500 vs. 10 4 ). Unlike the RL method, which is non-monotonely convergent 
in NMSE (which can not be seen in Fig. [3] since the method was terminated before the algorithm diverged), the 
convergence of PIS is monotone in both E(c) and NMSE. 

E. Sparse reconstruction 

In the second part of the experimental study, the PIS method was tested in application to the problem of sparse 
image reconstruction, where the combined operation of image synthesis and blur is represented by A = H$. In 
this case, the blur model was defined by the convolution kernel h[i,j] = (i 2 + j 2 + with i,j = —D, . . . , D 
and D £ {2,7} (321. The frame operator 3? was defined to describe the translation invariant (TI) wavelet transform 
corresponding to the Haar wavelet. The number of wavelet resolutions was set to be equal to 4. As in the case 
with sparse deconvolution, the wavelet frame was extended by adding a constant vector so as to allow the image 
background to be modeled by a single element of the frame. 

In this subsection, image reconstructions produced by the proposed and reference methods are tested using a 
microscopic image of glomerulus and the standard Shepp-Logan phantom, which are shown in Subplots A of 
Fig. |4] and Fig. [5] respectively. Similar to the case of sparse reconstruction, the images have been offset by a 
constant (background) value to give rise to different values of SNR. In particular, the value was adjusted to result 
in SNR equal to 32 (moderate noises) and 8 (strong noises). The original, blurred, and contaminated images of the 
glomerulus and Shepp-Logan phantom are summarized in Fig. [4] and Fig. [5] for all the tested values of D and SNR. 

TABLE III 

NMSE AND SSIM VALUES OF THE RECONSTRUCTION METHODS UNDER COMPARISON USING THE SHEPP-LOGAN PHANTOM. 





L=2, SNR=32 


L=2, SNR=8 


NMSE 


SSIM 


NIT 


MNSE 


SSIM 


NIT 


RL 


0.101 


0.68 


10 


0.368 


0.54 


10 


RLTV 


0.141 


0.72 


10 


0.400 


0.71 


10 


PIDsplit+ 


0.123 


0.85 


5 


0.872 


0.68 


5 


VSTSR 


0.189 


0.65 


300 


0.377 


0.65 


300 


SPIRAL 


0.132 


0.76 


20,000 


0.378 


0.72 


20,000 


PIS 


0.09 


0.88 


500 


0.377 


0.84 


500 



The reference methods used in this section were RL, RLTV, PIDsplit+, VSTSR and SPIRAL. It should be noted 
that the GIS method has been excluded from the current experiment, whose inappropriateness of statistical model 
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ORIGINAL 




BLURRED (D=2) MEASURED <D = 2, SNR=32) 




Fig. 4. (Subplot A) Original image of glomerulus; (Subplot B) Blurred image of glomerulus with D=2; (Subplot C) Blurred and noisy image 
of glomerulus with D=2 and SNR=32; (Subplot D) Blurred image of glomerulus with D=7; (Subplot E) Blurred and noisy image of glomerulus 
with D=7 and SNR=8. 



makes it a poor candidate for comparison (as demonstrated by the results of the previous section). The regularization 
parameters of VSTSR, SPIRAL and PIS were set empirically to be equal to 0.05, 0.1 and 0.02, respectively. The 
regularization parameters of RLTV and PIDsplit+ were set to be equal to 0.002 and 0.01 according the guidelines 
provided in (14] and fT7) . 

The SPIRAL algorithm was applied with the orthogonal Haar wavelet transform (as opposed to its stationary 
version used by PIS), according to the requirements specified in pTj . In this case, to alleviate the artifacts caused 



by the property of the orthogonal wavelet transform being translational variant, the cycle-spinning algorithm of j 23 1 
was employed, with a total number of cycles set to be equal to 20. 
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BLURRED (D=2) MEASURED tf} = 2, SNR = 32) 




Fig. 5. (Subplot A) Original image of the Shepp-Logan phantom; (Subplot B) Blurred image of the Shepp-Logan phantom with D=2; (Subplot 
C) Blurred and noisy image of the Shepp-Logan phantom with D=2 and SNR=32; (Subplot D) Blurred image of the Shepp-Logan phantom 
with D=7; (Subplot E) Blurred and noisy image of the Shepp-Logan phantom with D=7 and SNR=8. 

In the case of the VSTSR, SPIRAL and PIS algorithms, their execution was terminated automatically at the 
point when the relative change ||/t+i — ft\\F/\\ft\\F between iterations t and t + 1 was observed to drop below a 
threshold of 10~ 6 . Unfortunately, the same stopping criterion could not be applied to the RL, RLTV, and PIDsplit+ 
methods, whose steady-state estimation was found to be unacceptably noisy. For this reason, these algorithms were 
terminated earlier, at the point when their corresponding NMSE reached their minimum values. It should be noted 
that, since the computation of NMSE requires the knowledge of a true image, the "NMSE-optimal" convergence 
cannot be considered as a practical tool. Therefore, the results of RL, RLTV, and PIDsplit+ methods reported in 
this section may not be reproduced in a real-life scenario. 
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TABLE IV 

NMSE AND SSIM VALUES OF THE RECONSTRUCTION METHODS UNDER COMPARISON USING A GLOMERULUS IMAGE; NMSE VALUES 

APPEAR AFTER MULTIPLICATION BY 10 2 . 





L=4, SNR=32 


L=7. SNR=8 


NMSE 


SSIM 


NIT 


MNSE 


SSIM 


NIT 


RL 


0.280 


0.86 


10 


0.772 


0.77 


10 


RLTV 


0.275 


0.87 


10 


0.755 


0.78 


10 


PIDsplit+ 


42.54 


0.90 


4 


47.51 


0.78 


4 


VSTSR 


9.150 


0.78 


300 


9.593 


0.62 


300 


SPIRAL 


0.303 


0.86 


50,000 


0.737 


0.81 


50,000 


PIS 


0.210 


0.92 


400 


0.729 


0.86 


400 




For the case of glomerulus, the reconstructions obtained with the proposed and reference methods are summarized 
in Fig. |6] (for D = 2, SNR=32) and Figj7](for D = 7, SNR=8). Moreover, Fig.[8]and Fig.|9]depict the reconstructions 
of the Shepp-Logan phantom for the cases of D = 2, SNR=32 and D = 7, SNR=8, respectively. Analyzing these 
results, one can clearly see that, in all the above cases, the PIS algorithm yields reconstructions of superior quality 
(in terms of the resolution and contrast gain), as compared to the reference methods. This observation is further 



supported by the quantitative measures of Tables III and IV which compare the estimation results in terms of 
the NMSE, SSIM index, and the number of iterations. As evidenced by the tables, the PIS method produces the 
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Fig. 7. Image reconstruction results corresponding to Fig|4]with D=7 and SNR=8. (Upper row of subplots) RL, RLTV and PIDsplit+ estimates; 
(Lower row of subplots) VSTSR, SPIRAL and PIS estimates. 

lowest NMSE and the largest SSIM index among all the methods under comparison. As to the number of iterations 
required by PIS, one can see (with a reference to Table [II]) that the method has a computational complexity either 
comparable or lower than that of the reference methods. 

VI. Discussion and Conclusions 

In the current paper, a new approach to the problem of denoising/reconstruction of digital images was presented. 
The method has been derived based on the framework of MAP estimation under the assumption of Poisson noise 
contamination. Such noise models are known to be standard in many important image modalities, including optical, 
microscopic, turbulent, and nuclear imaging, just to name a few. Moreover, whilst many of the existing solutions 
to the problem of enhancement of Poissonian images take advantage of certain simplifying assumptions about the 
noise nature, the proposed technique is optimized to deal with the realistic noise model at hand. 

Another advantage of the proposed method consists in the generality of its formulation. The latter allows applying 
the same reconstruction procedure to a number of different settings, such as image denoising or image enhancement 
through deconvolution. Furthermore, the prior assumptions made by the method regarding the nature of recovered 
images are general as well. Specifically, the images are assumed to admit a sparse representation in the domain of 
a properly chosen linear transform. Note that the reasonability of the above a priori modeling is firmly supported 
by the recent advances in the theory of sparse representation. 

Yet another critical advantage of the proposed PIS algorithm is in its algorithmic structure, which exploits the 
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Fig. 8. Image reconstruction results corresponding to Fig[5]with D=2 and SNR=32. (Upper row of subplots) RL, RLTV and PIDsplit+ estimates; 
(Lower row of subplots) VSTSR, SPIRAL and PIS estimates. 



RL 
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Fig. 9. Image reconstruction results corresponding to Fig|3]with D=7 and SNR=8. (Upper row of subplots) RL, RLTV and PIDspliW- estimates; 
(Lower row of subplots) VSTSR, SPIRAL and PIS estimates. 
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idea of iterative shrinkage. The latter allows solving non-smooth optimization problems at the computational cost 
of a steepest descent procedure. Consequently, the computational load required by the proposed method is relatively 
small, which allows the method to be applied for the solution of large-scale problems and/or for processing of large 
data sets. 

It was shown conceptually and experimentally that the performance of the PIS algorithm is superior to a number 
of alternative approaches. A series of comparison tests have been performed, in which PIS was shown to outperform 
the reference methods in terms of both NMSE and SSIM index measures. Moreover, as opposed to the alternative 
methods, the PIS algorithm has always been capable of converging in a stable and robust manner to a useful 
reconstruction result. 

We believe that the convergence speed of PIS iterations can be further improved via employing the line search 
strategy as detailed in J56| . It seems also to be possible to increase the accuracy of PIS estimation by using more 
sparsifying multiresolution transforms as compared to separable wavelets J57) . Finally, we note that the sparseness 
of representation coefficients appears to be a rather weak constraint to be used in the case of poorly conditioned 
convolution operators H. This problem, therefore, should be resolved by using different prior models, whose 
formulation constitutes one of the directions of our current research. 
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